ISOFORM 1 OF DIMETHYLGLYCINE DEHYDROGENASE-LIKE GENE



FIELD OF INVENTION 
This invention relates to newly identified polynucleotides, polypeptides encoded by them,
and to the use of such polynucleotides and polypeptides, and to their production. More
particularly, the polynucleotides and polypeptides of the present invention relate to the
dimethylglycine dehydrogenase family, hereinafter referred to as dimethylglycine dehydrogenase-like
gene. The invention also relates to inhibiting or activating the action of such polynucleotides and
polypeptides.



BACKGROUND OF THE INVENTION 

Sarcosine (N-methylglycine) is enzymatically formed from dimethylglycine by dimethylglycine
dehydrogenase (EC 1.5.99.2) and converted to glycine by sarcosine dehydrogenase (EC 1.5.99.1).
Sarcosine dehydrogenase deficiency causes sarcosinemia. This indicates that the dimethylglycine
dehydrogenase family has an established, proven history as therapeutic targets. Clearly there is a need
for identification and characterization of further members of the dimethylglycine dehydrogenase family
which can play a role in preventing, ameliorating or correcting dysfunctions or diseases, including, but
not limited to, sarcosinemia, cardiomyopathy, retinitis pigmentosa, deafness, neurological disease,
cancer, metabolic defects and AIDS.

SUMMARY OF THE INVENTION 

In one aspect, the invention relates to dimethylglycine dehydrogenase-like polypeptides and
recombinant materials and methods for their production. Another aspect of the invention relates to
methods for using such dimethylglycine dehydrogenase-like polypeptides and polynucleotides. Such uses
include the treatment of sarcosinemia, cardiomyopathy, retinitis pigmentosa, deafness, neurological
disease, cancer, metabolic defects and AIDS, among others. In still another aspect, the invention
relates to methods to identify agonists and antagonists using the materials provided by the
invention, and treating conditions associated with dimethylglycine dehydrogenase-like gene imbalance
with the identified compounds. Yet another aspect of the invention relates to diagnostic assays for
detecting diseases associated with inappropriate dimethylglycine dehydrogenase-like gene activity or
levels.

DESCRIPTION OF THE INVENTION
Definitions 

The following definitions are provided to facilitate understanding of certain terms used 
frequently herein. 

"Dimethylglycine dehydrogenase-like gene" refers, among others, generally to a polypeptide
having the amino acid sequence set forth in SEQ ID NO:2 or an allelic variant thereof.

"Dimethylglycine dehydrogenase-like gene activity" or "dimethylglycine dehydrogenase-like
polypeptide activity" or "biological activity of the dimethylglycine dehydrogenase-like gene" or
"dimethylglycine dehydrogenase-like polypeptide" refers to the metabolic or physiologic function of
said dimethylglycine dehydrogenase-like gene including similar activities or improved activities or
these activities with decreased undesirable side-effects. Also included are antigenic and
immunogenic activities of said dimethylglycine dehydrogenase-like gene.

"Dimethylglycine dehydrogenase-like gene" refers to a polynucleotide having the nucleotide 
sequence set forth in SEQ ID NO: 1 or allelic variants thereof and/or their complements. 

"Antibodies" as used herein includes polyclonal and monoclonal antibodies, chimeric,
single chain, and humanized antibodies, as well as Fab fragments, including the products of an Fab
or other immunoglobulin expression library.

"Isolated" means altered "by the hand of man" from the natural state. If an "isolated"
composition or substance occurs in nature, it has been changed or removed from its original
environment, or both. For example, a polynucleotide or a polypeptide naturally present in a living
animal is not "isolated," but the same polynucleotide or polypeptide separated from the coexisting
materials of its natural state is "isolated", as the term is employed herein.

"Polynucleotide" generally refers to any polyribonucleotide or polydeoxyribonucleotide,
which may be unmodified RNA or DNA or modified RNA or DNA. "Polynucleotides" include,
without limitation, single- and double-stranded DNA, DNA that is a mixture of single- and
double-stranded regions, single- and double-stranded RNA, and RNA that is a mixture of single- and
double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded
or, more typically, double-stranded or a mixture of single- and double-stranded regions. In
addition, "polynucleotide" refers to triple-stranded regions comprising RNA or DNA or both RNA
and DNA. The term polynucleotide also includes DNAs or RNAs containing one or more modified
bases and DNAs or RNAs with backbones modified for stability or for other reasons. "Modified"
bases include, for example, tritylated bases and unusual bases such as inosine. A variety of
modifications has been made to DNA and RNA; thus, "polynucleotide" embraces chemically,
enzymatically or metabolically modified forms of polynucleotides as typically found in nature, as
well as the chemical forms of DNA and RNA characteristic of viruses and cells. "Polynucleotide"
also embraces relatively short polynucleotides, often referred to as oligonucleotides.

"Polypeptide" refers to any peptide or protein comprising two or more amino acids joined
to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres. "Polypeptide"
refers to both short chains, commonly referred to as peptides, oligopeptides or oligomers, and to
longer chains, generally referred to as proteins. Polypeptides may contain amino acids other than
the 20 gene-encoded amino acids. "Polypeptides" include amino acid sequences modified either by
natural processes, such as posttranslational processing, or by chemical modification techniques
which are well known in the art. Such modifications are well described in basic texts and in more
detailed monographs, as well as in a voluminous research literature. Modifications can occur
anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the
amino or carboxyl termini. It will be appreciated that the same type of modification may be
present in the same or varying degrees at several sites in a given polypeptide. Also, a given
polypeptide may contain many types of modifications. Polypeptides may be branched as a result of
ubiquitination, and they may be cyclic, with or without branching. Cyclic, branched and branched
cyclic polypeptides may result from posttranslational natural processes or may be made by synthetic
methods. Modifications include acetylation, acylation, ADP-ribosylation, amidation, covalent
attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or
nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of
phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation
of covalent cross-links, formation of cystine, formation of pyroglutamate, formylation,
gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation,
myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization,
selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as
arginylation, and ubiquitination. See, for instance, PROTEINS - STRUCTURE AND
MOLECULAR PROPERTIES, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New
York, 1993 and Wold, F., Posttranslational Protein Modifications: Perspectives and Prospects,
pgs. 1-12 in POSTTRANSLATIONAL COVALENT MODIFICATION OF PROTEINS, B. C.
Johnson, Ed., Academic Press, New York, 1983; Seifter et al., "Analysis for protein modifications
and nonprotein cofactors", Meth Enzymol (1990) 182:626-646 and Rattan et al., "Protein
Synthesis: Posttranslational Modifications and Aging", Ann NY Acad Sci (1992) 663:48-62.

"Variant" as the term is used herein, is a polynucleotide or polypeptide that differs from a
reference polynucleotide or polypeptide respectively, but retains essential properties. A typical
variant of a polynucleotide differs in nucleotide sequence from another, reference polynucleotide.
Changes in the nucleotide sequence of the variant may or may not alter the amino acid sequence of
a polypeptide encoded by the reference polynucleotide. Nucleotide changes may result in amino
acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the
reference sequence, as discussed below. A typical variant of a polypeptide differs in amino acid
sequence from another, reference polypeptide. Generally, differences are limited so that the
sequences of the reference polypeptide and the variant are closely similar overall and, in many
regions, identical. A variant and reference polypeptide may differ in amino acid sequence by one or
more substitutions, additions or deletions in any combination. A substituted or inserted amino acid
residue may or may not be one encoded by the genetic code. A variant of a polynucleotide or
polypeptide may be naturally occurring, such as an allelic variant, or it may be a variant that is
not known to occur naturally. Non-naturally occurring variants of polynucleotides and
polypeptides may be made by mutagenesis techniques or by direct synthesis.

"Identity," as known in the art, is a relationship between two or more polypeptide sequences
or two or more polynucleotide sequences, as determined by comparing the sequences. In the art,
"identity" also means the degree of sequence relatedness between polypeptide or polynucleotide
sequences, as the case may be, as determined by the match between strings of such sequences.

"Identity" and "similarity" can be readily calculated by known methods, including but
not limited to those described in Computational Molecular Biology, Lesk, A.M., ed., Oxford
University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith,
D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I,
Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis
in Molecular Biology, von Heijne, G., Academic Press, 1987; Sequence Analysis Primer,
Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; and Carillo, H.,
and Lipman, D., SIAM J. Applied Math., 48: 1073 (1988). Preferred methods to determine
identity are designed to give the largest match between the sequences tested. Methods to
determine identity and similarity are codified in publicly available computer programs.
Preferred computer program methods to determine identity and similarity between two
sequences include, but are not limited to, the GCG program package (Devereux, J., et al.,
Nucleic Acids Research 12(1): 387 (1984)), BLASTP, BLASTN, and FASTA (Altschul, S.F. et
al., J. Molec. Biol. 215: 403-410 (1990)). The BLAST X program is publicly available from
NCBI and other sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD
20894; Altschul, S., et al., J. Mol. Biol. 215: 403-410 (1990)). The well known
Smith-Waterman algorithm may also be used to determine identity.
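By way of illustration only, the notion of "identity" as a match between aligned strings can be sketched as follows. The sketch assumes the two sequences have already been aligned by one of the programs named above and are supplied as equal-length strings with "-" marking gaps. The function name percent_identity and the choice to divide by the full alignment length are assumptions of this example; the cited programs may use other denominators.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumption of this sketch: identity = matching columns / total columns.
    Columns containing a gap never count as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment: 4 matching columns out of 6.
print(round(percent_identity("ACGT-A", "ACCTGA"), 2))
```

The value returned depends on the alignment supplied, which is why the text prefers methods that give the largest match between the sequences tested.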






WO 99/47559 PCT/CN98/00040 

Preferred parameters for polypeptide sequence comparison include the following:
1) Algorithm: Needleman and Wunsch, J. Mol. Biol. 48: 443-453 (1970)
Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci.
USA 89:10915-10919 (1992)
Gap Penalty: 12
Gap Length Penalty: 4

A program useful with these parameters is publicly available as the "gap" program
from Genetics Computer Group, Madison WI. The aforementioned parameters are the default
parameters for polypeptide comparisons (along with no penalty for end gaps).


Preferred parameters for polynucleotide comparison include the following:
1) Algorithm: Needleman and Wunsch, J. Mol. Biol. 48: 443-453 (1970)
Comparison matrix: matches = +10, mismatch = 0
Gap Penalty: 50
Gap Length Penalty: 3

A program useful with these parameters is publicly available as the "gap" program
from Genetics Computer Group, Madison WI. The aforementioned parameters are the default
parameters for polynucleotide comparisons.
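As a rough sketch of how the polynucleotide parameters above enter a global alignment, the following computes a Needleman-Wunsch style score with affine gaps. It is not the GCG "gap" program. It assumes that a gap of length k costs gap_open + k * gap_extend, and, unlike the default noted for polypeptides, it penalizes end gaps. The function name global_alignment_score is hypothetical.

```python
NEG_INF = float("-inf")


def global_alignment_score(a, b, match=10, mismatch=0, gap_open=50, gap_extend=3):
    """Best global alignment score with affine gap costs (three-state recursion).

    M: last column aligns a residue of a with a residue of b.
    X: last column aligns a residue of a with a gap.
    Y: last column aligns a gap with a residue of b.
    """
    n, m = len(a), len(b)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):          # leading gap in b
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):          # leading gap in a
        Y[0][j] = -(gap_open + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(
                M[i - 1][j] - gap_open - gap_extend,   # open a new gap
                X[i - 1][j] - gap_extend,              # extend the current gap
                Y[i - 1][j] - gap_open - gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] - gap_open - gap_extend,
                Y[i][j - 1] - gap_extend,
                X[i][j - 1] - gap_open - gap_extend,
            )
    return max(M[n][m], X[n][m], Y[n][m])


# Hypothetical toy inputs: three matches (+30) and one single-base gap (-53).
print(global_alignment_score("ACGT", "ACT"))
```

With a gap penalty this large relative to the match score, the alignment tolerates mismatches far more readily than gaps, which is the practical effect of the stated parameters.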

Preferred polynucleotide embodiments further include an isolated polynucleotide
comprising a polynucleotide having at least 50, 60, 70, 80, 85, 90, 95, 97 or 100% identity to
a polynucleotide reference sequence of SEQ ID NO:1, wherein said reference sequence may be
identical to the sequence of SEQ ID NO:1 or may include up to a certain integer number of
nucleotide alterations as compared to the reference sequence, wherein said alterations are
selected from the group consisting of at least one nucleotide deletion, substitution, including
transition and transversion, or insertion, and wherein said alterations may occur at the 5' or 3'
terminal positions of the reference nucleotide sequence or anywhere between those terminal
positions, interspersed either individually among the nucleotides in the reference sequence or in
one or more contiguous groups within the reference sequence, and wherein said number of
nucleotide alterations is determined by multiplying the total number of nucleotides in SEQ ID
NO:1 by the numerical percent of the respective percent identity and subtracting that product
from said total number of nucleotides in SEQ ID NO:1, or:

$$ n_n \le x_n - (x_n \cdot y), $$

wherein $n_n$ is the number of nucleotide alterations, $x_n$ is the total number of nucleotides in
SEQ ID NO:1, and $y$ is 0.50 for 50%, 0.60 for 60%, 0.70 for 70%, 0.80 for 80%, 0.85 for
85%, 0.90 for 90%, 0.95 for 95%, 0.97 for 97% or 1.00 for 100%, and wherein any non-integer
product of $x_n$ and $y$ is rounded down to the nearest integer prior to subtracting it from
$x_n$. Alterations of a polynucleotide sequence encoding the polypeptide of SEQ ID NO:2 may
create nonsense, missense or frameshift mutations in this coding sequence and thereby alter the
polypeptide encoded by the polynucleotide following such alterations.



Preferred polypeptide embodiments further include an isolated polypeptide comprising
a polypeptide having at least 50, 60, 70, 80, 85, 90, 95, 97 or 100% identity to a polypeptide
reference sequence of SEQ ID NO:2, wherein said reference sequence may be identical to the
sequence of SEQ ID NO:2 or may include up to a certain integer number of amino acid
alterations as compared to the reference sequence, wherein said alterations are selected from
the group consisting of at least one amino acid deletion, substitution, including conservative
and non-conservative substitution, or insertion, and wherein said alterations may occur at the
amino- or carboxy-terminal positions of the reference polypeptide sequence or anywhere
between those terminal positions, interspersed either individually among the amino acids in the
reference sequence or in one or more contiguous groups within the reference sequence, and
wherein said number of amino acid alterations is determined by multiplying the total number of
amino acids in SEQ ID NO:2 by the numerical percent of the respective percent identity and
subtracting that product from said total number of amino acids in SEQ ID NO:2, or:

$$ n_a \le x_a - (x_a \cdot y), $$

wherein $n_a$ is the number of amino acid alterations, $x_a$ is the total number of amino acids in
SEQ ID NO:2, and $y$ is 0.50 for 50%, 0.60 for 60%, 0.70 for 70%, 0.80 for 80%, 0.85 for
85%, 0.90 for 90%, 0.95 for 95%, 0.97 for 97% or 1.00 for 100%, and wherein any non-integer
product of $x_a$ and $y$ is rounded down to the nearest integer prior to subtracting it from
$x_a$.
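The arithmetic described for the permitted number of alterations can be sketched as below. The sketch takes the percent identity as an integer (50, 60, ..., 100) so that the rounding-down step is exact, and uses the 431-residue length stated herein for SEQ ID NO:2 as the example input. The function name max_alterations is an assumption of this example.

```python
def max_alterations(total_residues: int, percent_identity: int) -> int:
    """Upper bound on alterations: x - floor(x * y), with y = percent_identity / 100.

    Integer arithmetic is used so that the product x * y is rounded down
    exactly, as the text requires, before it is subtracted from x.
    """
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent_identity must lie between 0 and 100")
    product = (total_residues * percent_identity) // 100
    return total_residues - product


# 431 amino acids at 95% identity: 431 * 0.95 = 409.45, rounded down to 409.
print(max_alterations(431, 95))  # 431 - 409 = 22
```

The same function applies to nucleotide alterations when the total number of nucleotides in SEQ ID NO:1 is supplied in place of the amino acid count.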



Polypeptides of the Invention 

In one aspect, the present invention relates to dimethylglycine dehydrogenase-like
polypeptides (or dimethylglycine dehydrogenase-like proteins). The dimethylglycine
dehydrogenase-like polypeptides include the polypeptide of SEQ ID NO:2; as well as polypeptides
comprising the amino acid sequence of SEQ ID NO:2; and polypeptides comprising an amino acid
sequence which has at least 80% identity to that of SEQ ID NO:2 over its entire length, still more
preferably at least 90% identity, and even still more preferably at least 95% identity to SEQ ID
NO:2. Furthermore, those with at least 97-99% identity are highly preferred. Also included within
dimethylglycine dehydrogenase-like polypeptides are polypeptides having an amino acid sequence
which has at least 80% identity to the polypeptide having the amino acid sequence of SEQ ID
NO:2 over its entire length, still more preferably at least 90% identity, and even still more
preferably at least 95% identity to SEQ ID NO:2. Furthermore, those with at least 97-99% identity
are highly preferred. Preferably, dimethylglycine dehydrogenase-like polypeptides exhibit at least one
biological activity of dimethylglycine dehydrogenase-like gene.

The dimethylglycine dehydrogenase-like polypeptides may be in the form of the "mature"
protein or may be a part of a larger protein such as a fusion protein. It is often advantageous to
include an additional amino acid sequence which contains secretory or leader sequences,
pro-sequences, sequences which aid in purification such as multiple histidine residues, or an
additional sequence for stability during recombinant production.

Fragments of the dimethylglycine dehydrogenase-like polypeptides are also included in the
invention. A fragment is a polypeptide having an amino acid sequence that is entirely the same as part,
but not all, of the amino acid sequence of the aforementioned dimethylglycine dehydrogenase-like
polypeptides. As with dimethylglycine dehydrogenase-like polypeptides, fragments may be
"free-standing," or comprised within a larger polypeptide of which they form a part or region, most
preferably as a single continuous region. Representative examples of polypeptide fragments of the
invention include, for example, fragments from about amino acid number 1-20, 21-40, 41-60, 61-80,
81-100, and 101 to the end of dimethylglycine dehydrogenase-like polypeptide. In this context "about"
includes the particularly recited ranges larger or smaller by several, 5, 4, 3, 2 or 1 amino acid at
either extreme or at both extremes.
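The representative fragment boundaries recited above can be generated mechanically. The following is a minimal sketch, assuming 1-based inclusive residue numbering, five 20-residue windows, and a final fragment running to the end of the polypeptide. The function name fragment_ranges and its parameters are hypothetical, and the "about" tolerance is not modeled.

```python
def fragment_ranges(length: int, window: int = 20, n_windows: int = 5):
    """Return 1-based inclusive (start, end) pairs: fixed windows, then the remainder."""
    if length < 1:
        raise ValueError("length must be positive")
    ranges = []
    start = 1
    for _ in range(n_windows):
        end = start + window - 1
        if end >= length:
            break  # the remaining residues form the final fragment
        ranges.append((start, end))
        start = end + 1
    ranges.append((start, length))
    return ranges


# Using the 431-residue length stated herein for SEQ ID NO:2.
print(fragment_ranges(431))
```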

Preferred fragments include, for example, truncation polypeptides having the amino acid
sequence of dimethylglycine dehydrogenase-like polypeptides, except for deletion of a continuous series
of residues that includes the amino terminus, or a continuous series of residues that includes the carboxyl
terminus or deletion of two continuous series of residues, one including the amino terminus and one
including the carboxyl terminus. Also preferred are fragments characterized by structural or functional
attributes such as fragments that comprise alpha-helix and alpha-helix forming regions, beta-sheet and
beta-sheet-forming regions, turn and turn-forming regions, coil and coil-forming regions, hydrophilic
regions, hydrophobic regions, alpha amphipathic regions, beta amphipathic regions, flexible regions,
surface-forming regions, substrate binding region, and high antigenic index regions. Other preferred
fragments are biologically active fragments. Biologically active fragments are those that mediate
dimethylglycine dehydrogenase-like gene activity, including those with a similar activity or an improved
activity, or with a decreased undesirable activity. Also included are those that are antigenic or
immunogenic in an animal, especially in a human.

Preferably, all of these polypeptide fragments retain the biological activity of the
dimethylglycine dehydrogenase-like gene, including antigenic activity. Variants of the defined sequence
and fragments also form part of the present invention. Preferred variants are those that vary from the
referents by conservative amino acid substitutions, i.e., those that substitute a residue with another of
like characteristics. Typical such substitutions are among Ala, Val, Leu and Ile; among Ser and Thr;
among the acidic residues Asp and Glu; among Asn and Gln; among the basic residues Lys and Arg;
and among the aromatic residues Phe and Tyr. Particularly preferred are variants in which several, 5-10,
1-5, or 1-2 amino acids are substituted, deleted, or added in any combination.
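The conservative substitution groups listed above lend themselves to a simple lookup. The sketch below encodes exactly those six groups in one-letter code and tests whether a given replacement stays within a group. The names CONSERVATIVE_GROUPS and is_conservative are assumptions of this example, and other classifications of conservative substitution exist.

```python
# One-letter codes for the groups named in the text:
# Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu, Asn/Gln, Lys/Arg, Phe/Tyr.
CONSERVATIVE_GROUPS = [
    set("AVLI"),
    set("ST"),
    set("DE"),
    set("NQ"),
    set("KR"),
    set("FY"),
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and fall within the same listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("L", "I"))  # Leu -> Ile, same aliphatic group
print(is_conservative("D", "K"))  # Asp -> Lys, acidic to basic
```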

The dimethylglycine dehydrogenase-like polypeptides of the invention can be prepared in any
suitable manner. Such polypeptides include isolated naturally occurring polypeptides, recombinantly
produced polypeptides, synthetically produced polypeptides, or polypeptides produced by a combination
of these methods. Means for preparing such polypeptides are well understood in the art.

Polynucleotides of the Invention 

Another aspect of the invention relates to dimethylglycine dehydrogenase-like polynucleotides.
Dimethylglycine dehydrogenase-like polynucleotides include isolated polynucleotides which encode the
dimethylglycine dehydrogenase-like polypeptides and fragments, and polynucleotides closely related
thereto. More specifically, the dimethylglycine dehydrogenase-like polynucleotides of the invention
include a polynucleotide comprising the nucleotide sequence contained in SEQ ID NO:1 encoding a
dimethylglycine dehydrogenase-like polypeptide of SEQ ID NO:2, and polynucleotides having the
particular sequence of SEQ ID NO:1. Dimethylglycine dehydrogenase-like polynucleotides further
include a polynucleotide comprising a nucleotide sequence that has at least 80% identity over its entire
length to a nucleotide sequence encoding the dimethylglycine dehydrogenase-like polypeptide of SEQ ID
NO:2, and a polynucleotide comprising a nucleotide sequence that is at least 80% identical to that of
SEQ ID NO:1 over its entire length. In this regard, polynucleotides at least 90% identical are
particularly preferred, and those with at least 95% are especially preferred. Furthermore, those with at
least 97% are highly preferred and those with at least 98-99% are most highly preferred, with at least
99% being the most preferred. Also included under dimethylglycine dehydrogenase-like
polynucleotides is a nucleotide sequence which has sufficient identity to a nucleotide sequence
contained in SEQ ID NO:1 to hybridize under conditions useable for amplification or for use as a
probe or marker. The invention also provides polynucleotides which are complementary to such
dimethylglycine dehydrogenase-like polynucleotides.
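For illustration, the complement of a DNA strand, read in the conventional 5' to 3' direction, can be derived as sketched below. The function name reverse_complement is an assumption of this example, and only the four unambiguous DNA bases are handled.

```python
# Translation table for Watson-Crick pairing of the four DNA bases.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# First six bases of the open reading frame shown in Table 1 (ATG GCC).
print(reverse_complement("ATGGCC"))
```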

Dimethylglycine dehydrogenase-like gene of the invention is structurally related to other proteins
of the dimethylglycine dehydrogenase family, as shown by the results of sequencing the cDNA of Table
1 (SEQ ID NO:1) encoding human dimethylglycine dehydrogenase-like gene. The cDNA sequence of
SEQ ID NO:1 contains an open reading frame (nucleotide number 133 to 1425) encoding a polypeptide
of 431 amino acids of SEQ ID NO:2. The amino acid sequence of Table 2 (SEQ ID NO:2) has about
88.8% identity (using FASTA) in 278 amino acid residues with partial rat dimethylglycine
dehydrogenase-like protein (PJL Blache et al., Swissprot: Q64380). Furthermore, dimethylglycine
dehydrogenase-like protein (SEQ ID NO:2) is 35.2% identical to rat dimethylglycine dehydrogenase over
381 amino acid residues (H Lang et al., Eur. J. Biochem. 198:793-799, 1991). The nucleotide sequence
of Table 1 (SEQ ID NO:1) has about 86.2% identity (using FASTA) in 836 nucleotide residues with rat
dimethylglycine dehydrogenase-like gene (PJL Blache et al., Genbank: L79910). Furthermore,
dimethylglycine dehydrogenase-like gene (SEQ ID NO:1) is 54.1% identical to rat dimethylglycine
dehydrogenase over 440 nucleotide base residues (H Lang et al., Eur. J. Biochem. 198:793-799, 1991).
Thus, dimethylglycine dehydrogenase-like polypeptides and polynucleotides of the present invention are
expected to have, inter alia, similar biological functions/properties to their homologous polypeptides and
polynucleotides, and their utility is obvious to anyone skilled in the art.

Table 1 a

CCTGGAGTTCCGGCCAGGCCACTGCTTGGGAAGCAAGAAGGTGAAGGCACCTCTGCTGGGCCAAGCACTCTTAGGGCCGA 

GGGGCACTGCAGCTGACAAGAGCTCCCTGTTTTGCTGAGGCCTGGAGCCCCCATGGCCTCACTGAGCCGAGCCCTACGTG

TGGCTGCTGCCCACCCTCGCCAGAGCCCTACCCGGGGCATGGGGCCATGCAACCTGTCCAGCGCAGCTGGCCCCACAGCC 

GAGAAGAGTGTGCCATATCAGCGGACCCTGAAGGAGGGACAGGGCACCTCGGTGGTGGCCCAAGGCCCAAGCCGGCCCCT 

GCCCAGCACGGCCAACGTGGTGGTCATTGGTGGAGGCAGCTTGGGCTGCCAGACCCTGTACCACCTGGCCAAGCTGGGCA 

TGAGTGGGGCGGTGCTGCTGGAGCGGGAGCGGCTGACCTCCGGGACCACCTGGCACACGGCAGGCCTGCTGTGGCAGCTG 

CGGCCCAGTGACGTGGAGGTGGAGCTTCTGGCCCACACTCGGCGGGTGGTGAGCCGGGAGCTGGAGGAGGAGACGGGACT 

ACACACGGGCTGGATCCAGAATGGGGGCCTCTTCATCGCGTCCAACCGGCAGCGCCTGGACGAGTACAAGAGGCTCATGT 

CGCTGGGCAAGGCGTATGGTGTGGAATCCCATGTGCTGAGCCCGGCAGAGACCAAGACTCTGTACCCGCTGATGAATGTG 

GACGACCTCTACGGGACCCTGTATGTGCCGCACGACGGTACCATGGACCCCGCTGGCACCTGTACCACCCTCGCCAGGGC 

AGCTTCTGCCCGAGGAGCACAGGTCATTGAGAACTGCCCAGTGACCGGCATTCGTGTGTGGACGGATGATTTTGGGGTGC 

GGCGGGTCGCGGGTGTGGAGACTCAGCATGGTTCCATCCAGACACCCTGCGTGGTCAATTGTGCAGGAGTGTGGGCAAGT 

GCTGTGGGCCGGATGGCTGGAGTCAAGGTCCCGCTGGTGGCCATGCACCATGCCTATGTCGTCACCGAGCGCATCGAGGG 

GATTCAGAACATGCCCAATGTCCGTGATCATGATGCCTCTGTCTACCTCCGCCTCCAAGGGGATGCCTTGTCTGTGGGTG 

GCTATGAGGCCAACCCCATCTTTTGGGAGGAGGTGTCAGACAAGTTTGCCTTCGGCCTCTTTGACCTGGACTGGGAGGTG

TTCACCCAGCACATTGAAGGCGCCATCAACAGGGTCCCCGTGCTGGAGAAGACAGGAATCAAGTCCACGGTCTGCGGCCC 

TGAATCCTTCACGCCCGACCACAAGCCCCTGATGGGGGAGGCACCTGAGCTCCGAGGGTTCTTCCTGGGCTGTGGCTTCA 

acagcgcagggaaggtccagacagtcctgccactcctgtttaccgtcaacgtctatctgtatctgtaggtcaggaggaca 
aacataggtcaataaatatgtaatgttagtgaacg 

a A nucleotide sequence of a human dimethylglycine dehydrogenase-like gene (SEQ ID NO: 1). 



Table 2 b

MASLSRALRVAAAHPRQSPTRGMGPCNLSSAAGPTAEKSVPYQRTLKEGQGTSVVAQGPSRPLPSTANVVVIGGGSLGCQ
TLYHLAKLGMSGAVLLERERLTSGTTWHTAGLLWQLRPSDVEVELLAHTRRVVSRELEEETGLHTGWIQNGGLFIASNRQ
RLDEYKRLMSLGKAYGVESHVLSPAETKTLYPLMNVDDLYGTLYVPHDGTMDPAGTCTTLARAASARGAQVIENCPVTGI
RVWTDDFGVRRVAGVETQHGSIQTPCVVNCAGVWASAVGRMAGVKVPLVAMHHAYVVTERIEGIQNMPNVRDHDASVYLR
LQGDALSVGGYEANPIFWEEVSDKFAFGLFDLDWEVFTQHIEGAINRVPVLEKTGIKSTVCGPESFTPDHKPLMGEAPEL
RGFFLGCGFNSAGKVQTVLPLLFTVNVYLYL

b An amino acid sequence of a human dimethylglycine dehydrogenase-like gene (SEQ ID NO: 2). 

One polynucleotide of the present invention encoding dimethylglycine dehydrogenase-like gene
may be obtained using standard cloning and screening, from a cDNA library derived from mRNA in
cells of human fetal liver using the expressed sequence tag (EST) analysis (Adams, M.D., et al.,
Science (1991) 252:1651-1656; Adams, M.D., et al., Nature (1992) 355:632-634; Adams, M.D.,
et al., Nature (1995) 377 Supp:3-174). Polynucleotides of the invention can also be obtained from
natural sources such as genomic DNA libraries or can be synthesized using well known and
commercially available techniques.
The nucleotide sequence encoding dimethylglycine dehydrogenase-like polypeptide of SEQ
ID NO:2 may be identical to the polypeptide encoding sequence contained in Table 1 (nucleotide
number 133 to 1425 of SEQ ID NO:1), or it may be a sequence, which as a result of the redundancy
(degeneracy) of the genetic code, also encodes the polypeptide of SEQ ID NO:2.
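The effect of degeneracy can be illustrated with a short translation sketch using the standard genetic code. The first sequence below is the first 27 nucleotides of the open reading frame of Table 1 (nucleotides 133 to 159). The second is a hypothetical variant, constructed for this example only, in which two codons are replaced by synonymous codons. The function name translate is likewise an assumption of this example.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a coding sequence in frame, stopping at the first stop codon."""
    seq = coding_sequence.upper().replace("U", "T")
    peptide = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


table1_start = "ATGGCCTCACTGAGCCGAGCCCTACGT"  # from Table 1, nucleotides 133-159
synonymous = "ATGGCTTCTCTGAGCCGAGCCCTACGT"    # hypothetical: GCC->GCT, TCA->TCT

print(translate(table1_start))
print(translate(synonymous))
```

Both sequences yield the same nine residues, which correspond to the start of SEQ ID NO:2 in Table 2.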

When the polynucleotides of the invention are used for the recombinant production of
dimethylglycine dehydrogenase-like polypeptide, the polynucleotide may include the coding sequence
for the mature polypeptide or a fragment thereof, by itself; the coding sequence for the mature
polypeptide or fragment in reading frame with other coding sequences, such as those encoding a leader or
secretory sequence, a pre-, or pro- or prepro- protein sequence, or other fusion peptide portions. For
example, a marker sequence which facilitates purification of the fused polypeptide can be encoded. In
certain preferred embodiments of this aspect of the invention, the marker sequence is a hexa-histidine
peptide, as provided in the pQE vector (Qiagen, Inc.) and described in Gentz et al., Proc Natl Acad Sci
USA (1989) 86:821-824, or is an HA tag. The polynucleotide may also contain non-coding 5' and 3'
sequences, such as transcribed, non-translated sequences, splicing and polyadenylation signals, ribosome
binding sites and sequences that stabilize mRNA.
Further preferred embodiments are polynucleotides encoding dimethylglycine
dehydrogenase-like gene variants that comprise the amino acid sequence of the dimethylglycine
dehydrogenase-like polypeptide of Table 2 (SEQ ID NO:2) in which several, 5-10, 1-5, 1-3, 1-2 or 1
amino acid residues are substituted, deleted or added, in any combination.

The present invention further relates to polynucleotides that hybridize to the herein
above-described sequences. In this regard, the present invention especially relates to polynucleotides
which hybridize under stringent conditions to the herein above-described polynucleotides. As herein used,
the term "stringent conditions" means hybridization will occur only if there is at least 80%, and
preferably at least 90%, and more preferably at least 95%, yet even more preferably 97-99% identity
between the sequences.

Polynucleotides of the invention, which are identical or sufficiently identical to a nucleotide
sequence contained in SEQ ID NO:1 or a fragment thereof, may be used as hybridization probes for
cDNA and genomic DNA, to isolate full-length cDNAs and genomic clones encoding dimethylglycine
dehydrogenase-like polypeptides and to isolate cDNA and genomic clones of other genes (including genes
encoding homologs and orthologs from species other than human) that have a high sequence similarity to
the dimethylglycine dehydrogenase-like gene. Such hybridization techniques are known to those of skill
in the art. Typically these nucleotide sequences are 80% identical, preferably 90% identical, more
preferably 95% identical to that of the referent. The probes generally will comprise at least 15
nucleotides. Preferably, such probes will have at least 30 nucleotides and may have at least 50
nucleotides. Particularly preferred probes will range between 30 and 50 nucleotides.

In one embodiment, to obtain a polynucleotide encoding dimethylglycine dehydrogenase-like
polypeptides, including homologs and orthologs from species other than human, a method comprises the
steps of screening an appropriate library under stringent hybridization conditions with a labeled probe
having the sequence of SEQ ID NO: 1 or a fragment thereof; and isolating full-length cDNA and genomic
clones containing said polynucleotide sequence. Thus in another aspect, dimethylglycine dehydrogenase-like
polynucleotides of the present invention further include a nucleotide sequence comprising a nucleotide
sequence that hybridizes under stringent conditions to a nucleotide sequence having SEQ ID NO: 1 or a
fragment thereof. Also included with dimethylglycine dehydrogenase-like polypeptides are polypeptides
comprising an amino acid sequence encoded by a nucleotide sequence obtained under the above hybridization
conditions. Such hybridization techniques are well known to those of skill in the art. Stringent
hybridization conditions are as defined above or, alternatively, conditions under overnight incubation at
42°C in a solution comprising: 50% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50
mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 microgram/ml
denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1x SSC at about 65°C.
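
As an illustrative sketch only, the salt concentrations implied by the hybridization and wash steps above can be worked out in Python. The sketch assumes that the parenthetical composition (150 mM NaCl, 15 mM trisodium citrate) is the standard 1x SSC composition, so that 5x and 0.1x SSC are simple multiples of it; the names used are hypothetical.

```python
# Assumed 1x SSC composition in mM (standard laboratory definition).
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_composition(strength: float) -> dict:
    """Return the component concentrations (mM) of SSC at the given strength."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    return {name: round(conc * strength, 6) for name, conc in SSC_1X_MM.items()}


hybridization_ssc = ssc_composition(5.0)  # 5x SSC used during overnight incubation
wash_ssc = ssc_composition(0.1)           # 0.1x SSC used for the filter wash
```

Under this assumption the hybridization solution contains 750 mM NaCl and the wash contains 15 mM NaCl, which shows why the low-salt wash at about 65°C is the more stringent step.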

The polynucleotides and polypeptides of the present invention may be employed as research
reagents and materials for discovery of treatments and diagnostics for animal and human disease.


Vectors, Host Cells, Expression 

The present invention also relates to vectors which comprise a polynucleotide or polynucleotides
of the present invention, and host cells which are genetically engineered with vectors of the invention and
to the production of polypeptides of the invention by recombinant techniques. Cell-free translation
systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of
the present invention.

For recombinant production, host cells can be genetically engineered to incorporate expression
systems or portions thereof for polynucleotides of the present invention. Introduction of polynucleotides
into host cells can be effected by methods described in many standard laboratory manuals, such as Davis
et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986) and Sambrook et al., MOLECULAR
CLONING: A LABORATORY MANUAL, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y. (1989), including calcium phosphate transfection, DEAE-dextran mediated transfection,
transvection, microinjection, cationic lipid-mediated transfection, electroporation, transduction, scrape
loading, ballistic introduction or infection.

Representative examples of appropriate hosts include bacterial cells, such as streptococci, 
staphylococci, E. coli, Streptomyces and Bacillus subtilis cells; fungal cells, such as yeast cells and 
Aspergillus cells; insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as 
CHO, COS, HeLa, C127, 3T3, BHK, HEK 293 and Bowes melanoma cells; and plant cells.

A great variety of expression systems can be used. Such systems include, among others,
chromosomal, episomal and virus-derived systems, e.g., vectors derived from bacterial plasmids, from
bacteriophage, from transposons, from yeast episomes, from insertion elements, from yeast chromosomal
elements, from viruses such as baculoviruses, papova viruses, such as SV40, vaccinia viruses,
adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors derived from
combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, such as
cosmids and phagemids. The expression systems may contain control regions that regulate as well as
engender expression. Generally, any system or vector suitable to maintain, propagate or express
polynucleotides to produce a polypeptide in a host may be used. The appropriate nucleotide sequence
may be inserted into an expression system by any of a variety of well-known and routine techniques,
such as, for example, those set forth in Sambrook et al., MOLECULAR CLONING, A LABORATORY
MANUAL (supra).

For secretion of the translated protein into the lumen of the endoplasmic reticulum, into the
periplasmic space or into the extracellular environment, appropriate secretion signals may be
incorporated into the desired polypeptide. These signals may be endogenous to the polypeptide or they
may be heterologous signals.

If the dimethylglycine dehydrogenase-like polypeptide is to be expressed for use in screening
assays, generally, it is preferred that the polypeptide be produced at the surface of the cell. In this
event, the cells may be harvested prior to use in the screening assay. If a dimethylglycine
dehydrogenase-like polypeptide is secreted into the medium, the medium can be recovered in order to
recover and purify the polypeptide; if produced intracellularly, the cells must first be lysed before
the polypeptide is recovered.

Dimethylglycine dehydrogenase-like polypeptides can be recovered and purified from
recombinant cell cultures by well-known methods including ammonium sulfate or ethanol precipitation,
acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography,
hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and
lectin chromatography. Most preferably, high performance liquid chromatography is employed for
purification. Well-known techniques for refolding proteins may be employed to regenerate the active
conformation when the polypeptide is denatured during isolation and/or purification.


Diagnostic Assays 

This invention also relates to the use of dimethylglycine dehydrogenase-like polynucleotides
as diagnostic reagents. Detection of a mutated form of a dimethylglycine dehydrogenase-like gene
associated with a dysfunction will provide a diagnostic tool that can add to or define a diagnosis of a
disease or susceptibility to a disease which results from under-expression, over-expression or altered
expression of the dimethylglycine dehydrogenase-like gene. Individuals carrying mutations in the
dimethylglycine dehydrogenase-like gene may be detected at the DNA level by a variety of techniques.

Nucleic acids for diagnosis may be obtained from a subject's cells, such as from blood, urine,
saliva, tissue biopsy or autopsy material. The genomic DNA may be used directly for detection or may
be amplified enzymatically by using PCR or other amplification techniques prior to analysis. RNA or
cDNA may also be used in similar fashion. Deletions and insertions can be detected by a change in size
of the amplified product in comparison to the normal genotype. Point mutations can be identified by
hybridizing amplified DNA to labeled dimethylglycine dehydrogenase-like gene nucleotide sequences.
Perfectly matched sequences can be distinguished from mismatched duplexes by RNase digestion or by
differences in melting temperatures. DNA sequence differences may also be detected by alterations in
electrophoretic mobility of DNA fragments in gels, with or without denaturing agents, or by direct DNA
sequencing. See, e.g., Myers et al., Science (1985) 230:1242. Sequence changes at specific locations
may also be revealed by nuclease protection assays, such as RNase and S1 protection or the chemical
cleavage method. See Cotton et al., Proc Natl Acad Sci USA (1985) 85: 4397-4401. In another
embodiment, an array of oligonucleotide probes comprising the dimethylglycine dehydrogenase-like gene
nucleotide sequence or fragments thereof can be constructed to conduct efficient screening of, e.g., genetic
mutations. Array technology methods are well known and have general applicability and can be used to
address a variety of questions in molecular genetics including gene expression, genetic linkage, and
genetic variability. (See for example: M. Chee et al., Science, Vol 274, pp 610-613 (1996)).

The diagnostic assays offer a process for diagnosing or determining a susceptibility to
sarcosinemia, cardiomyopathy, retinitis pigmentosa, deafness, neurological disease, cancer, metabolic
defects and AIDS through detection of mutation in the dimethylglycine dehydrogenase-like gene by the
methods described.
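
The size comparison described above for detecting deletions and insertions can be sketched, purely for illustration, as follows. The function name and the tolerance parameter (an allowance for the sizing resolution of the gel or instrument) are assumptions and are not part of the methods described.

```python
def classify_size_change(normal_bp: int, observed_bp: int, tolerance_bp: int = 0) -> str:
    """Compare an amplified product's size with that expected for the normal genotype.

    tolerance_bp is a hypothetical allowance for sizing error.
    """
    if normal_bp <= 0 or observed_bp <= 0 or tolerance_bp < 0:
        raise ValueError("sizes must be positive and tolerance non-negative")
    difference = observed_bp - normal_bp
    if difference > tolerance_bp:
        return "insertion"
    if difference < -tolerance_bp:
        return "deletion"
    return "no size change"
```

A result of "no size change" does not exclude a point mutation, which is why the text turns to hybridization, melting temperature, electrophoretic mobility and sequencing for such changes.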

In addition, sarcosinemia, cardiomyopathy, retinitis pigmentosa, deafness, neurological disease,
cancer, metabolic defects and AIDS can be diagnosed by methods comprising determining from a
sample derived from a subject an abnormally decreased or increased level of dimethylglycine
dehydrogenase-like polypeptide or dimethylglycine dehydrogenase-like mRNA. Decreased or
increased expression can be measured at the RNA level using any of the methods well known in the
art for the quantitation of polynucleotides, such as, for example, PCR, RT-PCR, RNase protection,
Northern blotting and other hybridization methods. Assay techniques that can be used to determine
levels of a protein, such as a dimethylglycine dehydrogenase-like polypeptide, in a sample derived from a
host are well-known to those of skill in the art. Such assay methods include radioimmunoassays,
competitive-binding assays, Western Blot analysis and ELISA assays.
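
One way to make "abnormally decreased or increased" concrete is sketched below. The text does not define a cut-off, so the reference interval used here (mean of control samples plus or minus k standard deviations, with k = 2) is an assumption chosen only for illustration, as are the function and parameter names.

```python
from statistics import mean, stdev


def classify_level(sample_level: float, control_levels: list, k: float = 2.0) -> str:
    """Classify a measured mRNA or polypeptide level against control samples.

    Assumption: levels outside mean +/- k * SD of the controls count as abnormal.
    """
    if len(control_levels) < 2:
        raise ValueError("at least two control measurements are needed")
    center = mean(control_levels)
    spread = stdev(control_levels)
    if sample_level > center + k * spread:
        return "increased"
    if sample_level < center - k * spread:
        return "decreased"
    return "within reference interval"
```

The same comparison applies whether the level was measured at the RNA level (e.g., by RT-PCR) or at the protein level (e.g., by ELISA), provided the sample and the controls are measured in the same units.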

Thus in another aspect, the present invention relates to a diagnostic kit for a disease or
susceptibility to a disease, particularly sarcosinemia, cardiomyopathy, retinitis pigmentosa, deafness,
neurological disease, cancer, metabolic defects and AIDS, which comprises:

(a) a dimethylglycine dehydrogenase-like polynucleotide, preferably the nucleotide sequence of SEQ
ID NO: 1, or a fragment thereof;

(b) a nucleotide sequence complementary to that of (a);

(c) a dimethylglycine dehydrogenase-like polypeptide, preferably the polypeptide of SEQ ID NO: 2,
or a fragment thereof; or

(d) an antibody to a dimethylglycine dehydrogenase-like polypeptide, preferably to the polypeptide of
SEQ ID NO: 2.

It will be appreciated that in any such kit, (a), (b), (c) or (d) may comprise a substantial
component.

Chromosome Assays 

The nucleotide sequences of the present invention are also valuable for chromosome
identification. The sequence is specifically targeted to and can hybridize with a particular location on an
individual human chromosome. The mapping of relevant sequences to chromosomes according to the
present invention is an important first step in correlating those sequences with gene-associated disease.
Once a sequence has been mapped to a precise chromosomal location, the physical position of the
sequence on the chromosome can be correlated with genetic map data. Such data are found, for
example, in V. McKusick, Mendelian Inheritance in Man (available online through Johns Hopkins
University Welch Medical Library). The relationship between genes and diseases that have been mapped
to the same chromosomal region is then identified through linkage analysis (coinheritance of physically
adjacent genes).

The differences in the cDNA or genomic sequence between affected and unaffected 
individuals can also be determined. If a mutation is observed in some or all of the affected 
individuals but not in any normal individuals, then the mutation is likely to be the causative agent 
of the disease. 

The dimethylglycine dehydrogenase-like gene is mapped to 9q34 where sarcosinemia was 
localized. 

Antibodies 

The polypeptides of the invention or fragments or analogs thereof, or cells expressing them,
can also be used as immunogens to produce antibodies immunospecific for the dimethylglycine
dehydrogenase-like polypeptides. The term "immunospecific" means that the antibodies have
substantially greater affinity for the polypeptides of the invention than their affinity for other related
polypeptides in the prior art.

Antibodies generated against the dimethylglycine dehydrogenase-like polypeptides can be
obtained by administering the polypeptides or epitope-bearing fragments, analogs or cells to an animal,
preferably a nonhuman, using routine protocols. For preparation of monoclonal antibodies, any
technique which provides antibodies produced by continuous cell line cultures can be used. Examples
include the hybridoma technique (Kohler, G. and Milstein, C., Nature (1975) 256:495-497), the trioma
technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today (1983) 4:72) and
the EBV-hybridoma technique (Cole et al., MONOCLONAL ANTIBODIES AND CANCER
THERAPY, pp. 77-96, Alan R. Liss, Inc., 1985).

Techniques for the production of single chain antibodies (U.S. Patent No. 4,946,778) can also 
be adapted to produce single chain antibodies to polypeptides of this invention. Also, transgenic mice, or 
other organisms including other mammals, may be used to express humanized antibodies. 

The above-described antibodies may be employed to isolate or to identify clones expressing the
polypeptide or to purify the polypeptides by affinity chromatography.

Antibodies against dimethylglycine dehydrogenase-like polypeptides may also be employed to 
treat sarcosinemia, cardiomyopathy, retinitis pigmentosa, deafness, neurological disease, cancer, 
metabolic defects and AIDS, among others. 

Vaccines

Another aspect of the invention relates to a method for inducing an immunological
response in a mammal which comprises inoculating the mammal with dimethylglycine
dehydrogenase-like polypeptide, or a fragment thereof, adequate to produce antibody and/or T cell
immune response to protect said animal from sarcosinemia, cardiomyopathy, retinitis pigmentosa,
deafness, neurological disease, cancer, metabolic defects and AIDS, among others. Yet another aspect
of the invention relates to a method of inducing immunological response in a mammal which
comprises delivering dimethylglycine dehydrogenase-like polypeptide via a vector directing
expression of dimethylglycine dehydrogenase-like polynucleotide in vivo in order to induce such an
immunological response to produce antibody to protect said animal from diseases.

A further aspect of the invention relates to an immunological/vaccine formulation
(composition) which, when introduced into a mammalian host, induces an immunological response
in that mammal to a dimethylglycine dehydrogenase-like polypeptide wherein the composition
comprises a dimethylglycine dehydrogenase-like polypeptide or dimethylglycine dehydrogenase-like
gene. The vaccine formulation may further comprise a suitable carrier. Since dimethylglycine
dehydrogenase-like polypeptides may be broken down in the stomach, it is preferably administered
parenterally (including subcutaneous, intramuscular, intravenous, intradermal etc. injection).
Formulations suitable for parenteral administration include aqueous and non-aqueous sterile
injection solutions which may contain antioxidants, buffers, bacteriostats and solutes which render
the formulation isotonic with the blood of the recipient; and aqueous and non-aqueous sterile
suspensions which may include suspending agents or thickening agents. The formulations may be
presented in unit-dose or multi-dose containers, for example, sealed ampoules and vials and may be
stored in a freeze-dried condition requiring only the addition of the sterile liquid carrier immediately
prior to use. The vaccine formulation may also include adjuvant systems for enhancing the
immunogenicity of the formulation, such as oil-in-water systems and other systems known in the
art. The dosage will depend on the specific activity of the vaccine and can be readily determined by
routine experimentation.

Screening Assays 

The dimethylglycine dehydrogenase-like polypeptide of the present invention may be employed
in a screening process for compounds which activate (agonists) or inhibit activation of (antagonists, or
otherwise called inhibitors) the dimethylglycine dehydrogenase-like polypeptide of the present invention.
Thus, polypeptides of the invention may also be used to identify agonists or antagonists from, for
example, cells, cell-free preparations, chemical libraries, and natural product mixtures. These agonists
or antagonists may be natural or modified substrates, ligands, receptors, enzymes, etc., as the case may
be, of the polypeptide of the present invention; or may be structural or functional mimetics of the
polypeptide of the present invention. See Coligan et al., Current Protocols in Immunology 1(2): Chapter
5 (1991).

Dimethylglycine dehydrogenase-like polypeptides are responsible for many biological functions,
including many pathologies. Accordingly, it is desirable to find compounds and drugs which stimulate
dimethylglycine dehydrogenase-like polypeptides on the one hand and which can inhibit the function of
dimethylglycine dehydrogenase-like polypeptides on the other hand. In general, agonists are employed
for therapeutic and prophylactic purposes for such conditions as sarcosinemia, cardiomyopathy, retinitis
pigmentosa, deafness, neurological disease, cancer, metabolic defects and AIDS. Antagonists may be
employed for a variety of therapeutic and prophylactic purposes for such conditions as sarcosinemia,
cardiomyopathy, retinitis pigmentosa, deafness, neurological disease, cancer, metabolic defects and
AIDS.

In general, such screening procedures may involve using appropriate cells which express the
dimethylglycine dehydrogenase-like polypeptide or respond to dimethylglycine dehydrogenase-like
polypeptides of the present invention. Such cells include cells from mammals, yeast, Drosophila or
E. coli. Cells which express the dimethylglycine dehydrogenase-like polypeptide (or cell membrane
containing the expressed polypeptide) or respond to dimethylglycine dehydrogenase-like polypeptide are
then contacted with a test compound to observe binding, or stimulation or inhibition of a functional
response. The dimethylglycine dehydrogenase-like activity of the cells which were contacted with the
candidate compounds is then compared with that of the same cells which were not contacted.
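
The comparison of contacted and non-contacted cells can be sketched as follows, for illustration only. The function name and the 20% relative-change threshold are assumptions; the text does not specify how large a change must be before a compound is treated as a candidate agonist or antagonist.

```python
def classify_compound(treated_activity: float, untreated_activity: float,
                      threshold: float = 0.2) -> str:
    """Compare activity in cells contacted with a candidate compound against
    the same cells not contacted.

    threshold is a hypothetical minimum relative change (0.2 = 20%).
    """
    if untreated_activity <= 0:
        raise ValueError("untreated activity must be positive")
    relative_change = (treated_activity - untreated_activity) / untreated_activity
    if relative_change >= threshold:
        return "candidate agonist"
    if relative_change <= -threshold:
        return "candidate antagonist"
    return "no effect detected"
```

In practice the threshold would be set from replicate measurements of the untreated cells, so that ordinary assay variation is not mistaken for stimulation or inhibition.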

The assays may simply test binding of a candidate compound wherein adherence to the
cells bearing the dimethylglycine dehydrogenase-like polypeptide is detected by means of a label
directly or indirectly associated with the candidate compound or in an assay involving competition
with a labeled competitor. Further, these assays may test whether the candidate compound results
in a signal generated by activation of the dimethylglycine dehydrogenase-like polypeptide, using
detection systems appropriate to the cells bearing the dimethylglycine dehydrogenase-like
polypeptide. Inhibitors of activation are generally assayed in the presence of a known agonist and
the effect of the presence of the candidate compound on activation by the agonist is observed.
Further, the assays may simply comprise the steps of mixing a candidate compound with a
solution containing a dimethylglycine dehydrogenase-like polypeptide to form a mixture, measuring
dimethylglycine dehydrogenase-like gene activity in the mixture, and comparing the dimethylglycine
dehydrogenase-like gene activity of the mixture to a standard.

The dimethylglycine dehydrogenase-like gene cDNA, protein and antibodies to the protein
may also be used to configure assays for detecting the effect of added compounds on the production
of dimethylglycine dehydrogenase-like gene mRNA and protein in cells. For example, an ELISA may
be constructed for measuring secreted or cell associated levels of dimethylglycine dehydrogenase-like
protein using monoclonal and polyclonal antibodies by standard methods known in the art, and this
can be used to discover agents which may inhibit or enhance the production of dimethylglycine
dehydrogenase-like gene (also called antagonist or agonist, respectively) from suitably manipulated
cells or tissues.

The dimethylglycine dehydrogenase-like protein may be used to identify membrane bound or
soluble receptors, if any, through standard receptor binding techniques known in the art. These
include, but are not limited to, ligand binding and crosslinking assays in which the dimethylglycine
dehydrogenase-like gene is labeled with a radioactive isotope (e.g., 125I), chemically modified (e.g.,
biotinylated), or fused to a peptide sequence suitable for detection or purification, and incubated
with a source of the putative receptor (cells, cell membranes, cell supernatants, tissue extracts,
bodily fluids). Other methods include biophysical techniques such as surface plasmon resonance
and spectroscopy. In addition to being used for purification and cloning of the receptor, these
binding assays can be used to identify agonists and antagonists of dimethylglycine dehydrogenase-like
genes which compete with the binding of the dimethylglycine dehydrogenase-like gene to its
receptors, if any. Standard methods for conducting screening assays are well understood in the art.

Examples of potential dimethylglycine dehydrogenase-like polypeptide antagonists include
antibodies or, in some cases, oligonucleotides or proteins which are closely related to the ligands,
substrates, receptors, enzymes, etc., as the case may be, of the dimethylglycine dehydrogenase-like
polypeptide, e.g., a fragment of the ligands, substrates, receptors, enzymes, etc.; or small molecules
which bind to the polypeptide of the present invention but do not elicit a response, so that the activity of
the polypeptide is prevented.

Thus in another aspect, the present invention relates to a screening kit for identifying
agonists, antagonists, ligands, receptors, substrates, enzymes, etc. for dimethylglycine
dehydrogenase-like polypeptides; or compounds which decrease or enhance the production of
dimethylglycine dehydrogenase-like polypeptides, which comprises:

(a) a dimethylglycine dehydrogenase-like polypeptide, preferably that of SEQ ID NO:2;

(b) a recombinant cell expressing a dimethylglycine dehydrogenase-like polypeptide, preferably that
of SEQ ID NO: 2;

(c) a cell membrane expressing a dimethylglycine dehydrogenase-like polypeptide, preferably that of
SEQ ID NO: 2; or

(d) antibody to a dimethylglycine dehydrogenase-like polypeptide, preferably that of SEQ ID NO: 2.

It will be appreciated that in any such kit, (a), (b), (c) or (d) may comprise a substantial
component.

Prophylactic and Therapeutic Methods

This invention provides methods of treating abnormal conditions such as sarcosinemia,
cardiomyopathy, retinitis pigmentosa, deafness, neurological disease, cancer, metabolic defects and
AIDS, related to both an excess of and insufficient amounts of dimethylglycine dehydrogenase-like
polypeptide activity.

If the activity of dimethylglycine dehydrogenase-like polypeptide is in excess, several
approaches are available. One approach comprises administering to a subject an inhibitor compound
(antagonist) as hereinabove described along with a pharmaceutically acceptable carrier in an amount
effective to inhibit the function of the dimethylglycine dehydrogenase-like polypeptide, such as, for
example, by blocking the binding of ligands, substrates, receptors, enzymes, etc., or by inhibiting a
second signal, and thereby alleviating the abnormal condition. In another approach, soluble forms of
dimethylglycine dehydrogenase-like polypeptides still capable of binding the ligand, substrate,
enzymes, receptors, etc. in competition with endogenous dimethylglycine dehydrogenase-like
polypeptides may be administered. Typical embodiments of such competitors comprise fragments
of the dimethylglycine dehydrogenase-like polypeptide.

In still another approach, expression of the gene encoding endogenous dimethylglycine
dehydrogenase-like polypeptides can be inhibited using expression blocking techniques. Such known
techniques involve the use of antisense sequences, either internally generated or separately
administered. See, for example, O'Connor, J Neurochem (1991) 56:560 in Oligodeoxynucleotides
as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988). Alternatively,
oligonucleotides which form triple helices with the gene can be supplied. See, for example, Lee et
al., Nucleic Acids Res (1979) 6:3073; Cooney et al., Science (1988) 241:456; Dervan et al.,
Science (1991) 251:1360. These oligomers can be administered per se or the relevant oligomers
can be expressed in vivo.

For treating abnormal conditions related to an under-expression of dimethylglycine
dehydrogenase-like gene and its activity, several approaches are also available. One approach comprises
administering to a subject a therapeutically effective amount of a compound which activates the
dimethylglycine dehydrogenase-like polypeptide, i.e., an agonist as described above, in combination with
a pharmaceutically acceptable carrier, to thereby alleviate the abnormal condition. Alternatively, gene
therapy may be employed to effect the endogenous production of dimethylglycine dehydrogenase-like
gene by the relevant cells in the subject. For example, a polynucleotide of the invention may be
engineered for expression in a replication defective retroviral vector, as discussed above. The retroviral
expression construct may then be isolated and introduced into a packaging cell transduced with a
retroviral plasmid vector containing RNA encoding a polypeptide of the present invention such that the
packaging cell now produces infectious viral particles containing the gene of interest. These producer
cells may be administered to a subject for engineering cells in vivo and expression of the polypeptide in
vivo. For an overview of gene therapy, see Chapter 20, Gene Therapy and other Molecular Genetic-based
Therapeutic Approaches, (and references cited therein) in Human Molecular Genetics, T Strachan and
A P Read, BIOS Scientific Publishers Ltd (1996). Another approach is to administer a therapeutic
amount of dimethylglycine dehydrogenase-like polypeptides in combination with a suitable
pharmaceutical carrier.

Formulation and Administration 

Peptides, such as the soluble form of dimethylglycine dehydrogenase-like polypeptides, and
agonists and antagonist peptides or small molecules, may be formulated in combination with a suitable
pharmaceutical carrier. Such formulations comprise a therapeutically effective amount of the
polypeptide or compound, and a pharmaceutically acceptable carrier or excipient. Such carriers include
but are not limited to, saline, buffered saline, dextrose, water, glycerol, ethanol, and combinations
thereof. Formulation should suit the mode of administration, and is well within the skill of the art. The
invention further relates to pharmaceutical packs and kits comprising one or more containers filled with
one or more of the ingredients of the aforementioned compositions of the invention.

Polypeptides and other compounds of the present invention may be employed alone or in 
conjunction with other compounds, such as therapeutic compounds. 

Preferred forms of systemic administration of the pharmaceutical compositions include injection,
typically by intravenous injection. Other injection routes, such as subcutaneous, intramuscular, or
intraperitoneal, can be used. Alternative means for systemic administration include transmucosal and
transdermal administration using penetrants such as bile salts or fusidic acids or other detergents. In
addition, if properly formulated in enteric or encapsulated formulations, oral administration may also be
possible. Administration of these compounds may also be topical and/or localised, in the form of salves,
pastes, gels and the like.

The dosage range required depends on the choice of peptide, the route of administration, the
nature of the formulation, the nature of the subject's condition, and the judgment of the attending
practitioner. Suitable dosages, however, are in the range of 0.1-100 µg/kg of subject. Wide variations in
the needed dosage, however, are to be expected in view of the variety of compounds available and the
differing efficiencies of various routes of administration. For example, oral administration would be
expected to require higher dosages than administration by intravenous injection. Variations in these
dosage levels can be adjusted using standard empirical routines for optimization, as is well understood in
the art.
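
As a simple worked illustration of the stated range, the total dose bounds for a subject of a given body weight can be computed as below. The function name is hypothetical, and the sketch does nothing more than scale the 0.1-100 µg/kg range by weight; it does not account for route of administration or any of the other factors listed above.

```python
def dose_range_ug(weight_kg: float, low_ug_per_kg: float = 0.1,
                  high_ug_per_kg: float = 100.0) -> tuple:
    """Return (minimum, maximum) total dose in micrograms for the given body weight."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    return (low_ug_per_kg * weight_kg, high_ug_per_kg * weight_kg)


low_dose, high_dose = dose_range_ug(70.0)  # hypothetical 70 kg subject
```

For a 70 kg subject the range works out to roughly 7 µg to 7000 µg (7 mg) in total, a thousandfold span that reflects the wide variation the text anticipates.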

Polypeptides used in treatment can also be generated endogenously in the subject, in treatment
modalities often referred to as "gene therapy" as described above. Thus, for example, cells from a
subject may be engineered with a polynucleotide, such as a DNA or RNA, to encode a polypeptide ex
vivo, for example, by the use of a retroviral plasmid vector. The cells are then introduced into the
subject.


All publications, including but not limited to patents and patent applications, cited in this 
specification are herein incorporated by reference as if each individual publication were specifically 
and individually indicated to be incorporated by reference herein as though fully set forth.

1. An isolated polynucleotide comprising a nucleotide sequence that has at least 80%
identity over its entire length to a nucleotide sequence encoding the dimethylglycine dehydrogenase-like
polypeptide of SEQ ID NO:2; or a nucleotide sequence complementary to said isolated polynucleotide.

2. The polynucleotide of claim 1 wherein said polynucleotide comprises the
nucleotide sequence contained in SEQ ID NO: 1 encoding the dimethylglycine dehydrogenase-like
polypeptide of SEQ ID NO: 2.


3. The polynucleotide of claim 1 wherein said polynucleotide comprises a nucleotide
sequence that is at least 80% identical to that of SEQ ID NO: 1 over its entire length.

4. The polynucleotide of claim 3 which is the polynucleotide of SEQ ID NO: 1.


5. The polynucleotide of claim 1 which is DNA or RNA. 

6. A DNA or RNA molecule comprising an expression system, wherein said
expression system is capable of producing a dimethylglycine dehydrogenase-like polypeptide
comprising an amino acid sequence, which has at least 80% identity with the polypeptide of SEQ ID
NO:2 when said expression system is present in a compatible host cell.

7. A host cell comprising the expression system of claim 6. 

8. A process for producing a dimethylglycine dehydrogenase-like polypeptide
comprising culturing a host of claim 7 under conditions sufficient for the production of said
polypeptide and recovering the polypeptide from the culture.

9. A process for producing a cell which produces a dimethylglycine dehydrogenase-like
polypeptide thereof comprising transforming or transfecting a host cell with the expression system
of claim 6 such that the host cell, under appropriate culture conditions, produces a dimethylglycine
dehydrogenase-like polypeptide.

10. A dimethylglycine dehydrogenase-like polypeptide comprising an amino acid
sequence which is at least 80% identical to the amino acid sequence of SEQ ID NO:2 over its entire
length.

11. The polypeptide of claim 10 which comprises the amino acid sequence of SEQ ID
NO: 2.

12. An antibody immunospecific for the dimethylglycine dehydrogenase-like polypeptide 
of claim 10. 

13. A method for the treatment of a subject in need of enhanced activity or expression
of the dimethylglycine dehydrogenase-like polypeptide of claim 10 comprising:

(a) administering to the subject a therapeutically effective amount of an agonist to said
polypeptide; and/or

(b) providing to the subject an isolated polynucleotide comprising a nucleotide sequence 
that has at least 80% identity to a nucleotide sequence encoding the dimethylglycine dehydrogenase-like 
polypeptide of SEQ ID NO:2 over its entire length; or a nucleotide sequence complementary to said 
nucleotide sequence in a form so as to effect production of said polypeptide activity in vivo. 

14. A method for the treatment of a subject having need to inhibit activity or 
expression of the dimethylglycine dehydrogenase-like polypeptide of claim 10 comprising:

(a) administering to the subject a therapeutically effective amount of an antagonist to 
said polypeptide; and/or 

(b) administering to the subject a nucleic acid molecule that inhibits the expression of 
the nucleotide sequence encoding said polypeptide; and/or 

(c) administering to the subject a therapeutically effective amount of a polypeptide 
that competes with said polypeptide for its ligand, substrate, or receptor.

15. A process for diagnosing a disease or a susceptibility to a disease in a subject
related to expression or activity of the dimethylglycine dehydrogenase-like polypeptide of claim 10 in a
subject comprising:

(a) determining the presence or absence of a mutation in the nucleotide sequence
encoding said dimethylglycine dehydrogenase-like polypeptide in the genome of said subject; and/or

(b) analyzing for the presence or amount of the dimethylglycine dehydrogenase-like
polypeptide expression in a sample derived from said subject.

16. A method for identifying compounds which inhibit (antagonize) or agonize the
dimethylglycine dehydrogenase-like polypeptide of claim 10 which comprises:

(a) contacting a candidate compound with cells which express the dimethylglycine
dehydrogenase-like polypeptide (or cell membrane expressing dimethylglycine dehydrogenase-like
polypeptide) or respond to the dimethylglycine dehydrogenase-like polypeptide; and

(b) observing the binding, or stimulation or inhibition of a functional response; or
comparing the ability of the cells (or cell membrane) which were contacted with the candidate
compounds with the same cells which were not contacted for dimethylglycine dehydrogenase-like
polypeptide activity.

17. An agonist identified by the method of claim 16. 

18. An antagonist identified by the method of claim 16. 

19. A recombinant host cell produced by a method of claim 9 or a membrane thereof
expressing a dimethylglycine dehydrogenase-like polypeptide.

(1) GENERAL INFORMATION



(i) APPLICANT: Hunan Medical University

(ii) TITLE OF INVENTION: ISOFORM 1 OF DIMETHYLGLYCINE DEHYDROGENASE-LIKE GENE

(iii) NUMBER OF SEQUENCES: 2 

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: RATNER & PRESTIA 

(B) STREET: P.O. BOX 980 

(C) CITY: VALLEY FORGE 

(D) STATE: PA 

(E) COUNTRY: USA

(F) ZIP: 19482

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette

(B) COMPUTER: IBM Compatible

(C) OPERATING SYSTEM: DOS 

(D) SOFTWARE: FastSEQ for Windows Version 2.0

(B) FILING DATE:

(A) APPLICATION NUMBER:

(B) FILING DATE:



(viii) ATTORNEY/AGENT INFORMATION:

(B) REGISTRATION NUMBER: 23,031

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 610-407-0700

(B) TELEFAX: 610-407-0701 

(C) TELEX: 846169 



WO 99/47559 PCT/CN98/00040



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1475 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



CCTGGAGTTC CGGCCAGGCC ACTGCTTGGG AAGCAAGAAG GTGAAGGCAC CTCTGCTGGG    60
CCAAGCACTC TTAGGGCCGA GGGGCACTGC AGCTGACAAG AGCTCCCTGT TTTGCTGAGG   120
CCTGGAGCCC CCATGGCCTC ACTGAGCCGA GCCCTACGTG TGGCTGCTGC CCACCCTCGC   180
CAGAGCCCTA CCCGGGGCAT GGGGCCATGC AACCTGTCCA GCGCAGCTGG CCCCACAGCC   240
GAGAAGAGTG TGCCATATCA GCGGACCCTG AAGGAGGGAC AGGGCACCTC GGTGGTGGCC   300
CAAGGCCCAA GCCGGCCCCT GCCCAGCACG GCCAACGTGG TGGTCATTGG TGGAGGCAGC   360
TTGGGCTGCC AGACCCTGTA CCACCTGGCC AAGCTGGGCA TGAGTGGGGC GGTGCTGCTG   420
GAGCGGGAGC GGCTGACCTC CGGGACCACC TGGCACACGG CAGGCCTGCT GTGGCAGCTG   480
CGGCCCAGTG ACGTGGAGGT GGAGCTTCTG GCCCACACTC GGCGGGTGGT GAGCCGGGAG   540
CTGGAGGAGG AGACGGGACT ACACACGGGC TGGATCCAGA ATGGGGGCCT CTTCATCGCG   600
TCCAACCGGC AGCGCCTGGA CGAGTACAAG AGGCTCATGT CGCTGGGCAA GGCGTATGGT   660
GTGGAATCCC ATGTGCTGAG CCCGGCAGAG ACCAAGACTC TGTACCCGCT GATGAATGTG   720
GACGACCTCT ACGGGACCCT GTATGTGCCG CACGACGGTA CCATGGACCC CGCTGGCACC   780
TGTACCACCC TCGCCAGGGC AGCTTCTGCC CGAGGAGCAC AGGTCATTGA GAACTGCCCA   840
GTGACCGGCA TTCGTGTGTG GACGGATGAT TTTGGGGTGC GGCGGGTCGC GGGTGTGGAG   900
ACTCAGCATG GTTCCATCCA GACACCCTGC GTGGTCAATT GTGCAGGAGT GTGGGCAAGT   960
GCTGTGGGCC GGATGGCTGG AGTCAAGGTC CCGCTGGTGG CCATGCACCA TGCCTATGTC  1020
GTCACCGAGC GCATCGAGGG GATTCAGAAC ATGCCCAATG TCCGTGATCA TGATGCCTCT  1080
GTCTACCTCC GCCTCCAAGG GGATGCCTTG TCTGTGGGTG GCTATGAGGC CAACCCCATC  1140
TTTTGGGAGG AGGTGTCAGA CAAGTTTGCC TTCGGCCTCT TTGACCTGGA CTGGGAGGTG  1200
TTCACCCAGC ACATTGAAGG CGCCATCAAC AGGGTCCCCG TGCTGGAGAA GACAGGAATC  1260
AAGTCCACGG TCTGCGGCCC TGAATCCTTC ACGCCCGACC ACAAGCCCCT GATGGGGGAG  1320
GCACCTGAGC TCCGAGGGTT CTTCCTGGGC TGTGGCTTCA ACAGCGCAGG GAAGGTCCAG  1380
ACAGTCCTGC CACTCCTGTT TACCGTCAAC GTCTATCTGT ATCTGTAGGT CAGGAGGACA  1440
AACATAGGTC AATAAATATG TAATGTTAGT GAACG                             1475
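
The relationship between SEQ ID NO:1 and SEQ ID NO:2 can be checked by translating the cDNA with the standard genetic code. The Python sketch below is a minimal illustration under stated assumptions: the function name `translate_from` and its 1-based `start` argument are hypothetical, the start position (the ATG at nucleotide 133, inside CCATGGCC) is read off the listing rather than stated in this section, and `fragment` is nucleotides 121-180 copied from the listing, so the ATG sits at offset 13 within it.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_from(cdna: str, start: int) -> str:
    """Translate `cdna` from 1-based position `start` up to the first stop codon.

    Whitespace in the listing is ignored; any character other than
    A, C, G or T raises KeyError.
    """
    seq = "".join(cdna.split()).upper()
    protein = []
    for k in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[k:k + 3]]
        if aa == "*":          # stop codon ends the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotides 121-180 of SEQ ID NO:1 as listed above.
fragment = ("CCTGGAGCCC CCATGGCCTC ACTGAGCCGA "
            "GCCCTACGTG TGGCTGCTGC CCACCCTCGC")

if __name__ == "__main__":
    # Position 133 of the full sequence is position 13 of this fragment.
    print(translate_from(fragment, 13))  # prints MASLSRALRVAAAHPR
```

The sixteen residues printed correspond to Met Ala Ser Leu Ser Arg Ala Leu Arg Val Ala Ala Ala His Pro Arg, the first row of SEQ ID NO:2. Applying the same function to the full 1475-base sequence with `start=133` would be expected to run to the TAG stop codon at nucleotides 1426-1428.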



(2) INFORMATION FOR SEQ ID NO:2:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 431 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Ala Ser Leu Ser Arg Ala Leu Arg Val Ala Ala Ala His Pro Arg
1               5                   10                  15
Gln Ser Pro Thr Arg Gly Met Gly Pro Cys Asn Leu Ser Ser Ala Ala
            20                  25                  30
Gly Pro Thr Ala Glu Lys Ser Val Pro Tyr Gln Arg Thr Leu Lys Glu
        35                  40                  45
Gly Gln Gly Thr Ser Val Val Ala Gln Gly Pro Ser Arg Pro Leu Pro
    50                  55                  60
Ser Thr Ala Asn Val Val Val Ile Gly Gly Gly Ser Leu Gly Cys Gln
65                  70                  75                  80
Thr Leu Tyr His Leu Ala Lys Leu Gly Met Ser Gly Ala Val Leu Leu
                85                  90                  95
Glu Arg Glu Arg Leu Thr Ser Gly Thr Thr Trp His Thr Ala Gly Leu
            100                 105                 110
Leu Trp Gln Leu Arg Pro Ser Asp Val Glu Val Glu Leu Leu Ala His
        115                 120                 125
Thr Arg Arg Val Val Ser Arg Glu Leu Glu Glu Glu Thr Gly Leu His
    130                 135                 140
Thr Gly Trp Ile Gln Asn Gly Gly Leu Phe Ile Ala Ser Asn Arg Gln
145                 150                 155                 160
Arg Leu Asp Glu Tyr Lys Arg Leu Met Ser Leu Gly Lys Ala Tyr Gly
                165                 170                 175
Val Glu Ser His Val Leu Ser Pro Ala Glu Thr Lys Thr Leu Tyr Pro
            180                 185                 190
Leu Met Asn Val Asp Asp Leu Tyr Gly Thr Leu Tyr Val Pro His Asp
        195                 200                 205
Gly Thr Met Asp Pro Ala Gly Thr Cys Thr Thr Leu Ala Arg Ala Ala
    210                 215                 220
Ser Ala Arg Gly Ala Gln Val Ile Glu Asn Cys Pro Val Thr Gly Ile
225                 230                 235                 240
Arg Val Trp Thr Asp Asp Phe Gly Val Arg Arg Val Ala Gly Val Glu
                245                 250                 255
Thr Gln His Gly Ser Ile Gln Thr Pro Cys Val Val Asn Cys Ala Gly
            260                 265                 270
Val Trp Ala Ser Ala Val Gly Arg Met Ala Gly Val Lys Val Pro Leu
        275                 280                 285
Val Ala Met His His Ala Tyr Val Val Thr Glu Arg Ile Glu Gly Ile
    290                 295                 300
Gln Asn Met Pro Asn Val Arg Asp His Asp Ala Ser Val Tyr Leu Arg
305                 310                 315                 320
Leu Gln Gly Asp Ala Leu Ser Val Gly Gly Tyr Glu Ala Asn Pro Ile
                325                 330                 335
Phe Trp Glu Glu Val Ser Asp Lys Phe Ala Phe Gly Leu Phe Asp Leu
            340                 345                 350
Asp Trp Glu Val Phe Thr Gln His Ile Glu Gly Ala Ile Asn Arg Val
        355                 360                 365
Pro Val Leu Glu Lys Thr Gly Ile Lys Ser Thr Val Cys Gly Pro Glu
    370                 375                 380
Ser Phe Thr Pro Asp His Lys Pro Leu Met Gly Glu Ala Pro Glu Leu
385                 390                 395                 400
Arg Gly Phe Phe Leu Gly Cys Gly Phe Asn Ser Ala Gly Lys Val Gln
                405                 410                 415
Thr Val Leu Pro Leu Leu Phe Thr Val Asn Val Tyr Leu Tyr Leu
            420                 425                 430